You can try reasoning models, text generation, image generation, speech recognition, object detection, and more - without setting up complex infrastructure.

Running a Model

1. Head over to the Open Source Inferencing tab from the left menu

This shows a list of all models available for Open Source Inferencing.
2. Use the Search Bar

You can search for a specific model if you have one in mind.
You can also use the filters to narrow down relevant models by category, such as Text Generation, Image Generation, Code Generation, Text to Video, Image-Text to Text, Speech to Text, Object Detection, Text to 3D, Reasoning Models & more.
3. Browse the Available Models

You can also browse the models on display; the catalog is updated as new models are released.
4. Click on the Model Card

This opens a new window with the model you selected ready to use. You can see the selected model & links to the relevant model page.
5. Set the Max New Tokens

The maximum number of tokens to generate, not counting the tokens in the prompt.
6. Set the Temperature

This controls the randomness of the model's output. Lower values produce less random completions; as the temperature approaches zero, the model becomes deterministic and repetitive.
7. Set the Top P

Only tokens within the top cumulative probability mass top_p are considered for sampling.
8. Set the Repetition Penalty

The penalty applied to tokens that have already appeared. A value of 1.0 means no penalty.
9. Enter the User Prompt

This is where you enter your prompt. The more descriptive the prompt, the better the results you can expect.
10. Click the Run button

Once you have set all relevant parameters & reviewed the prompt, click the Run button to get started.
11. Copy / Download / Save the Response

Depending on the model you used: if the output is text, you can copy it; if it's an image, you can download it; and so on.

Choose reasoning models for structured, multi-step problem solving; text generation models for chatbots, summarization, and content creation; and code models for programming assistants or code generation tasks. Use image and video generation models for creative workflows, and adjust parameters like temperature, max tokens, and top_p to fine-tune results for your use case. For production workloads, integrate via the API rather than relying on the playground UI, for better scalability and automation.

Please note that it takes some time for the model to initialize & go live. Don't reload or refresh your browser during this time; doing so may lose your work and the credits already incurred. Once the model is live, you can expect faster results.

Inference Parameters

When running open-source models, you can customize the output using several parameters. These control length, creativity, diversity, and repetition in generated responses.

Max New Tokens

  • What it is: The maximum number of tokens (words, characters, or subwords depending on the tokenizer) the model can generate in a single response.
  • Effect: Controls response length. Higher values allow longer outputs but may increase cost and response time.
  • Typical Range: 50 - 2000 (depends on model context window).
  • Example:
    • Short answer: 100 tokens
    • Detailed explanation: 500–1000 tokens
    • Long-form content: 2000+ tokens
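As a rough sketch of how this budget works (the 8192-token window below is an assumed example; actual limits depend on the model):

```python
def fits_context(prompt_tokens, max_new_tokens, context_window=8192):
    # The prompt and the generated tokens share one context window;
    # max_new_tokens caps only the generated part, not the prompt.
    return prompt_tokens + max_new_tokens <= context_window

fits_context(1200, 2000)   # 3200 tokens fit in an 8192-token window
fits_context(7000, 2000)   # 9000 tokens would not fit
```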

Temperature

  • What it is: A parameter that controls the randomness/creativity of the model’s output.
  • Effect:
    • Low temperature: deterministic, factual, less creative.
    • High temperature: more creative, diverse, but can produce inconsistent answers.
  • Range: 0 - 2
  • Recommendations:
    • 0.2 - 0.5: factual, technical tasks (QA, coding).
    • 0.7 - 1.0: balanced creativity (summaries, essays, chatbots).
    • 1.2 - 2.0: highly creative (storytelling, brainstorming).
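The effect of temperature can be sketched as scaling the model's logits before the softmax. This is a minimal illustration of the idea, not the provider's implementation:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Logits are divided by the temperature before the softmax:
    # low temperature sharpens the distribution, high temperature flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 1.5)  # flatter, more random
# the top token gets far more probability mass at low temperature
```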

Top P (Nucleus Sampling)

  • What it is: Controls diversity by limiting the next token choices to the smallest set whose cumulative probability is ≥ p.
  • Effect:
    • Lower values: more focused, safer outputs.
    • Higher values: more diverse, open-ended outputs.
  • Range: 0 - 1
  • Recommendations:
    • 0.7 - 0.9: good balance for most text generation tasks.
    • 1.0: considers all possible tokens (maximum diversity).
  • Tip: Usually tuned together with Temperature for best results.
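The "smallest set with cumulative probability ≥ p" rule can be sketched as follows — a minimal illustration of nucleus sampling, not the provider's implementation:

```python
def top_p_filter(probs, p):
    # Keep the smallest set of tokens whose cumulative probability >= p;
    # sampling then happens only among the kept tokens.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token_id, prob in ranked:
        kept.append(token_id)
        cumulative += prob
        if cumulative >= p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]  # already sorted for readability
top_p_filter(probs, 0.7)  # keeps only the two most likely tokens
```

Lower p prunes the candidate set harder, which is why low top_p gives safer, more focused outputs.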

Repetition Penalty

  • What it is: Reduces the likelihood of the model repeating the same phrases or tokens.
  • Effect: Encourages variety in generated text.
  • Range: 1.0 - 2.0
  • Recommendations:
    • 1.0: no penalty (default, natural flow).
    • 1.1 - 1.3: reduces repeated loops (best for chat, long responses).
    • 1.5+: strong penalty, but may make text unnatural.
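One common formulation, assumed here for illustration (the CTRL-style rule used by several open-source inference libraries; the exact rule may vary by model), divides positive logits of already-generated tokens by the penalty and multiplies negative ones:

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    # Tokens that already appeared become less likely: positive logits
    # shrink (divide by penalty), negative logits move further down (multiply).
    out = list(logits)
    for token_id in set(generated_ids):
        if out[token_id] > 0:
            out[token_id] /= penalty
        else:
            out[token_id] *= penalty
    return out

logits = [2.0, -1.0, 0.5]
apply_repetition_penalty(logits, [0, 1], 1.0)  # penalty 1.0: unchanged
apply_repetition_penalty(logits, [0, 1], 2.0)  # tokens 0 and 1 discouraged
```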

Quick Recommendations

  • Factual / technical tasks (QA, coding): Temperature 0.2 - 0.5, Top P 0.7 - 0.9, Repetition Penalty 1.0 - 1.1
  • Balanced tasks (summaries, essays, chatbots): Temperature 0.7 - 1.0, Top P 0.9, Repetition Penalty 1.1 - 1.3
  • Creative tasks (storytelling, brainstorming): Temperature 1.2 - 2.0, Top P 1.0, Repetition Penalty 1.1 - 1.3

Key Features

  • Wide Model Catalog: Access a growing list of open-source models including DeepSeek, Llama, CodeGen, Stable Diffusion, Whisper, YOLO, and more.
  • Multiple Categories:
    • Reasoning Models
    • Text Generation
    • Code Generation
    • Image Generation
    • Text-to-Video
    • Text-to-3D
    • Speech-to-Text
    • Object Detection
  • Flexible Pricing:
    • Reasoning models: $0.02 per request
    • Image generation models: $0.05 per request
    • Other AI models: $0.01 per request
    • Starter Pack: 500 requests for just $5
  • Interactive Playground: Run prompts directly in the UI with configurable parameters.
  • API Access: Call the same models via REST API for integration in your own applications.
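An API request body would carry the same parameters as the playground. The sketch below only builds such a payload; the field names, model identifier, and endpoint are illustrative assumptions — check the API reference for the actual schema:

```python
import json

# Hypothetical payload mirroring the playground parameters; the field names
# and endpoint below are placeholders, not the documented API schema.
API_URL = "https://<your-endpoint>/v1/inference"  # placeholder
payload = {
    "model": "Llama-3.1-8B-Instruct",
    "prompt": "Summarize the benefits of open-source inference.",
    "max_new_tokens": 500,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.1,
}
body = json.dumps(payload)  # send as the POST body with your API key header
```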

Available Models

Reasoning Models

  • DeepSeek-R1-Distill-Llama-8B
  • DeepSeek-R1-Distill-Qwen-7B
  • DeepSeek-R1-Distill-Qwen-14B
  • DeepSeek-R1-Distill-Qwen-1.5B

Text Generation

  • Llama-3.1-8B-Instruct
  • Mistral-7B-Instruct-v0.3
  • Falcon-11B
  • Gemma-2B
  • Paligemma-3b-pt-896

Code Generation

  • CodeQwen1.5-7B-Chat
  • Codegemma-7b-it
  • CodeLlama-7b-Instruct-hf

Image Generation

  • Stable-Diffusion-3-Medium-Diffusers
  • Stable-Diffusion-3.5-Large
  • Stable-Diffusion-xl-base-1.0

Other Modalities

  • Whisper-small: Speech-to-Text
  • Yolo-V8: Object Detection
  • AnimateDiff-Lightning-Anime: Text-to-Video
  • AnimateDiff-Lightning-Realistic: Text-to-Video
  • Shap-E: Text-to-3D
  • Phi-3-mini-128k-instruct: Text Generation
  • Phi-3-vision-128k-instruct: Image-to-Text